10 research outputs found

    A Masked Face Classification Benchmark on Low-Resolution Surveillance Images

    Full text link
    We propose a novel image dataset of tiny faces wearing face masks for mask classification purposes, dubbed Small Face MASK (SF-MASK), comprising roughly 20k low-resolution images drawn from diverse, heterogeneous datasets, with resolutions ranging from 7 x 7 to 64 x 64 pixels. Visualizing this collection with counting grids highlighted gaps in the variety of poses assumed by the pedestrians' heads. In particular, faces filmed by cameras mounted very high, in which the facial features appear strongly skewed, are absent. To address this structural deficiency, we produced a set of synthetic images that yields a satisfactory coverage of the intra-class variance. Furthermore, a small subsample of 1701 images contains badly worn face masks, opening the door to multi-class classification challenges. Experiments on SF-MASK focus on face mask classification using several classifiers. Results show that the richness of SF-MASK (real + synthetic images) leads all tested classifiers to outperform those trained on comparable face mask datasets, evaluated on a fixed test set of 1077 images. Dataset and evaluation code are publicly available here: https://github.com/HumaticsLAB/sf-mask (Comment: 15 pages, 7 figures. Accepted at T-CAP workshop @ ICPR 202)

    Language-enhanced RNR-Map: Querying Renderable Neural Radiance Field maps with natural language

    Full text link
    We present Le-RNR-Map, a Language-enhanced Renderable Neural Radiance map for Visual Navigation with natural language query prompts. The recently proposed RNR-Map employs a grid structure comprising latent codes positioned at each pixel. These latent codes, derived from image observations, enable: i) image rendering given a camera pose, since they can be converted to a Neural Radiance Field; ii) image navigation and localization with remarkable accuracy. On top of this, we enhance RNR-Map with CLIP-based embedding latent codes, allowing natural language search without additional label data. We evaluate the effectiveness of this map in single- and multi-object searches. We also investigate its compatibility with a Large Language Model as an "affordance query resolver". Code and videos are available at https://intelligolabs.github.io/Le-RNR-Map/ (Comment: Accepted at ICCVW23 VLA)
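    The natural-language search described above can be thought of as scoring every map cell against a text-query embedding. The sketch below is a minimal, hypothetical illustration of that idea using plain cosine similarity; the toy grid, the query vector, and the similarity_heatmap helper are assumptions for illustration, not the actual Le-RNR-Map code:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def similarity_heatmap(latent_grid, text_emb):
    """Score every cell of an H x W grid of language-aligned latent
    codes against a text-query embedding; returns an H x W heatmap."""
    return [[cosine(code, text_emb) for code in row] for row in latent_grid]

# Toy 2x2 map with 3-dim codes; the query aligns with the top-left cell.
grid = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
        [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]]
heat = similarity_heatmap(grid, [1.0, 0.0, 0.0])
```

    In the real system the codes are aligned with CLIP features, so cells depicting the queried object score highest and the heatmap localizes it without any extra label data.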

    A Machine Learning-oriented Survey on Tiny Machine Learning

    Full text link
    The emergence of Tiny Machine Learning (TinyML) has positively revolutionized the field of Artificial Intelligence by promoting the joint design of resource-constrained IoT hardware devices and their learning-based software architectures. TinyML plays an essential role in the fourth and fifth industrial revolutions, helping societies, economies, and individuals employ effective AI-infused computing technologies (e.g., smart cities, automotive, and medical robotics). Given its multidisciplinary nature, the field of TinyML has been approached from many different angles: this comprehensive survey aims to provide an up-to-date overview focused on the learning algorithms within TinyML-based solutions. The survey follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodological flow, allowing for a systematic and complete literature review. In particular, we first examine the three different workflows for implementing a TinyML-based system, i.e., ML-oriented, HW-oriented, and co-design. Second, we propose a taxonomy that covers the learning panorama under the TinyML lens, examining in detail the different families of model optimization and design, as well as the state-of-the-art learning techniques. Third, this survey presents the distinct features of hardware devices and software tools that represent the current state of the art for TinyML intelligent edge applications. Finally, we discuss the challenges and future directions. (Comment: Article currently under review at IEEE Access)

    IoT Systems for Healthy and Safe Life Environments

    Get PDF
    The past two years have sadly been marked by the worldwide spread of the SARS-CoV-2 pandemic. The first line of defense against this and other pandemic threats is to respect interpersonal distances, wear masks, and sanitize hands, air, and objects. Some of these countermeasures are becoming part of our daily lives, as they are now considered good practices for reducing the risk of infection and contagion. In this context, we present Safe Place, a modular IoT-enabled system designed to improve the safety and healthiness of living environments. The system combines sensors and actuators from different vendors with self-regulating procedures and AI algorithms to limit the spread of viruses and other pathogens, and to increase the quality and comfort offered to people while minimizing energy consumption. We discuss the main objectives of the system and its implementation, showing preliminary results that assess its potential for enhancing the conditions of living and working spaces.

    Split-Et-Impera: A Framework for the Design of Distributed Deep Learning Applications

    No full text
    Many recent pattern recognition applications rely on complex distributed architectures in which sensing and computational nodes interact through a communication network. Deep neural networks (DNNs) play an important role in this scenario, furnishing powerful decision mechanisms at the price of high computational effort. Consequently, powerful state-of-the-art DNNs are frequently split across computational nodes, e.g., a first part stays on an embedded device and the rest on a server. Deciding where to split a DNN is a challenge in itself, making the design of deep learning applications even more complicated. Therefore, we propose Split-Et-Impera, a novel and practical framework that i) determines the set of best split points of a neural network based on deep network interpretability principles, without resorting to a tedious try-and-test approach; ii) performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements; and iii) suggests the best match between the quality-of-service requirements of the application and the performance in terms of accuracy and latency.
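    To make the communication-aware evaluation concrete, a minimal latency model for a candidate split point could be sketched as follows. The per-layer timings, activation sizes, bandwidth, and the split_latency helper are illustrative assumptions, not Split-Et-Impera's actual simulator:

```python
def split_latency(edge_t, srv_t, out_bytes, in_bytes, bw_bps, split):
    """End-to-end latency (s) when layers [0, split) run on the edge
    device, the activation of layer split-1 (or the raw input, if
    split == 0) crosses the network, and layers [split, N) run server-side."""
    sent = in_bytes if split == 0 else out_bytes[split - 1]
    return sum(edge_t[:split]) + sent * 8 / bw_bps + sum(srv_t[split:])

# Toy 3-layer network on a 10 Mbit/s link: try every split point.
edge_t = [0.02, 0.03, 0.10]        # per-layer time on the embedded device (s)
srv_t = [0.005, 0.005, 0.01]       # per-layer time on the server (s)
out_b = [200_000, 50_000, 1_000]   # activation size after each layer (bytes)
latencies = [split_latency(edge_t, srv_t, out_b, 600_000, 10e6, s)
             for s in range(4)]
best_split = min(range(4), key=latencies.__getitem__)  # send after layer 2's input-side layers
```

    Even this toy model shows the trade-off the framework navigates: sending early activations is cheap in compute but expensive in bandwidth, while late splits overload the embedded device.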

    I-SPLIT: Deep Network Interpretability for Split Computing

    No full text
    This work makes a substantial step in the field of split computing, i.e., how to split a deep neural network to host its early part on an embedded device and the rest on a server. So far, potential split locations have been identified by exploiting purely architectural aspects, i.e., based on the layer sizes. Under this paradigm, the efficacy of the split in terms of accuracy can be evaluated only after having performed the split and retrained the entire pipeline, making an exhaustive evaluation of all plausible splitting points prohibitive in terms of time. Here we show that not only does the architecture of the layers matter, but also the importance of the neurons contained therein. A neuron is important if its gradient with respect to the correct class decision is high. It follows that a split should be applied right after a layer with a high density of important neurons, in order to preserve the information flowing up to that point. Building on this idea, we propose Interpretable Split (I-SPLIT): a procedure that identifies the most suitable splitting points by providing a reliable prediction of how well a given split will perform in terms of classification accuracy, before it is actually implemented. As a further major contribution of I-SPLIT, we show that the best choice for the splitting point in a multiclass categorization problem also depends on which specific classes the network has to deal with. Exhaustive experiments have been carried out on two networks, VGG16 and ResNet-50, and three datasets, Tiny-Imagenet-200, notMNIST, and Chest X-Ray Pneumonia. The source code is available at https://github.com/vips4/I-Split
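    The importance criterion above can be illustrated with a toy read-out layer. Under the simplifying assumption of a single linear classification layer, the gradient of the correct-class score with respect to a hidden neuron is just the corresponding weight, so importance reduces to a weight magnitude; the names W2, neuron_importance, and important_density are hypothetical, and this is not the I-SPLIT code:

```python
def neuron_importance(W2, c):
    """With a linear read-out score_c = sum_j W2[c][j] * h_j, the gradient
    d score_c / d h_j equals W2[c][j]; importance is its magnitude."""
    return [abs(w) for w in W2[c]]

def important_density(importance, thresh):
    """Fraction of neurons in a layer whose importance exceeds thresh;
    a good split point sits right after a layer with high density."""
    return sum(1 for g in importance if g > thresh) / len(importance)

# Toy read-out over 4 hidden neurons, correct class c = 0.
W2 = [[0.9, 0.05, 0.8, 0.01],
      [0.1, 0.7, 0.0, 0.2]]
imp = neuron_importance(W2, 0)         # [0.9, 0.05, 0.8, 0.01]
density = important_density(imp, 0.5)  # 2 of 4 neurons matter -> 0.5
```

    Repeating the computation for c = 1 gives a different density (0.25), echoing the observation that the best split point depends on which classes the network must handle.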

    SCENE-pathy: Capturing the Visual Selective Attention of People Towards Scene Elements

    No full text
    We present SCENE-pathy, a dataset and a set of baselines to study the visual selective attention (VSA) of people towards the 3D scene in which they are located. In practice, VSA reveals which parts of the scene are most attractive to an individual. Capturing VSA is of primary importance in marketing, retail management, surveillance, and many other fields. So far, VSA analysis has focused on very simple scenarios: a mall shelf or a tiny room, usually with a single subject involved. Our dataset, instead, considers a multi-person and much more complex 3D scenario, specifically a high-tech fair showroom presenting machines of an Industry 4.0 production line, where 25 subjects were captured for 2 min each while moving, observing the scene, and having social interactions. The subjects also filled out a questionnaire indicating which part of the scene was most interesting to them. Data acquisition was performed using HoloLens 2 devices, which allowed us to obtain ground-truth data on people's tracklets and gaze trajectories. Our proposed baselines capture VSA from RGB video data and a 3D scene model alone, providing interpretable 3D heatmaps. In total, there are more than 100K RGB frames with, for each person, annotated 3D head positions and 3D gaze vectors. The dataset is available here: https://intelligolabs.github.io/scene-pathy
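    One plausible way to turn head positions and gaze vectors into an interpretable 3D heatmap is a simple voting scheme over scene points: each gaze sample votes for every point falling inside its gaze cone. This is a hedged sketch of the general idea, not the paper's baselines; the cone threshold and the gaze_heatmap helper are assumptions:

```python
from math import sqrt

def gaze_heatmap(heads, gazes, scene_pts, cos_thresh=0.95):
    """For each (head position, gaze direction) sample, every scene point
    within the gaze cone (cosine > cos_thresh) receives one vote."""
    votes = [0] * len(scene_pts)
    for head, gaze in zip(heads, gazes):
        g_norm = sqrt(sum(c * c for c in gaze))
        for i, pt in enumerate(scene_pts):
            v = [pc - hc for pc, hc in zip(pt, head)]
            v_norm = sqrt(sum(c * c for c in v))
            if v_norm == 0.0:
                continue  # point coincides with the head
            cos = sum(a * b for a, b in zip(gaze, v)) / (g_norm * v_norm)
            if cos > cos_thresh:
                votes[i] += 1
    return votes

# One subject at the origin looking along +x: only the first point is hit.
votes = gaze_heatmap([(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)],
                     [(5.0, 0.0, 0.0), (0.0, 5.0, 0.0)])
```

    Accumulating such votes over all frames and subjects yields a per-point attention count that can be rendered as a heatmap over the 3D scene model.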

    Pose Forecasting in Industrial Human-Robot Collaboration

    No full text
    Pushing back the frontiers of collaborative robots in industrial environments, we propose a new Separable-Sparse Graph Convolutional Network (SeS-GCN) for pose forecasting. For the first time, SeS-GCN bottlenecks the interaction of the spatial, temporal, and channel-wise dimensions in GCNs, and it learns sparse adjacency matrices through a teacher-student framework. Compared to the state of the art, it uses only 1.72% of the parameters and is ∼4 times faster, while still performing comparably in forecasting accuracy on Human3.6M at 1 s in the future, which enables cobots to be aware of human operators. As a second contribution, we present a new benchmark of Cobots and Humans in Industrial COllaboration (CHICO). CHICO includes multi-view videos, 3D poses, and trajectories of 20 human operators and cobots engaging in 7 realistic industrial actions. Additionally, it reports 226 genuine collisions occurring during human-cobot interaction. We test SeS-GCN on CHICO for two important perception tasks in robotics: human pose forecasting, where it reaches an average error of 85.3 mm (MPJPE) at 1 s in the future with a run time of 2.3 ms, and collision detection, by comparing the forecasted human motion with the known cobot motion, obtaining an F1-score of 0.64.
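    For reference, the MPJPE figure quoted above is the mean Euclidean distance between predicted and ground-truth 3D joints, averaged over joints. A minimal computation on made-up toy joints (not CHICO data) looks like this:

```python
from math import dist  # Euclidean distance, Python 3.8+

def mpjpe(pred, gt):
    """Mean Per Joint Position Error: average Euclidean distance between
    corresponding predicted and ground-truth joints (units follow input)."""
    return sum(dist(p, g) for p, g in zip(pred, gt)) / len(gt)

# Toy 2-joint pose in millimetres; each joint is off by a (3, 4, 0) vector,
# i.e. 5 mm, so the MPJPE is 5.0 mm.
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 50.0)]
pred = [(3.0, 4.0, 0.0), (103.0, 4.0, 50.0)]
err = mpjpe(pred, gt)
```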

    I-MALL: An Effective Framework for Personalized Visits. Improving the Customer Experience in Stores

    No full text
    In this paper we present I-MALL, an ICT hardware and software infrastructure that enables the management of services in places such as shopping malls, showrooms, and conference facilities. I-MALL offers a network of services that perform customer behavior analysis through computer vision and provide personalized recommendations on digital signage terminals. The user can also interact with a social robot. Recommendations are inferred from the profile of interests computed by the system, which analyses the history of the customer's visit and their behavior, including their appearance, the route taken inside the facility, and their mood and gaze.

    The Post-pandemic Effects on IoT for Safety: The Safe Place Project

    No full text
    COVID-19 had substantial effects on the IoT community that designs systems for safety: the need for everyone to wear face masks, the analysis of crowds to avoid the spread of the disease, and the sanitization of public environments have led to exceptional research acceleration and fast engineering of the related solutions. Now that the pandemic is waning, some applications are becoming less important, while others are proving useful regardless of the criticality of COVID-19. The Safe Place project is a prime example of this situation (DATE23 MPP category: final stage). Safe Place is an Italian 3M euro regional industrial/academic project, financed by European funds, created to ensure a multidisciplinary, choral reaction to COVID-19 in critical environments such as rest homes and public places. The Safe Place consortium was able to understand what is no longer useful in this post-pandemic period and what instead is potentially attractive for the market: for example, face mask detection now has little importance, while sanitization remains highly relevant. This paper shares this analysis, which emerged through a co-design process involving three public Safe Place project demonstrators and heterogeneous professionals, from scientists to lawyers.